support gdr for intel xpu #11005
Walkthrough
Adds a ZE (Level Zero) GPUDirect RDMA driver presence check (guarded by …)
Changes
Sequence Diagram(s)
sequenceDiagram
participant Init as uct_ib_md_open_common()
participant KFD as ROCm KFD probe
participant ZE as ZE GPUDirect probe (new)
participant DMB as DMA-BUF/module checks
Note over Init: IB MD init sequence (high-level)
Init->>KFD: probe ROCm KFD
KFD-->>Init: result
alt HAVE_ZE
Init->>ZE: probe ZE GPUDirect driver
ZE-->>Init: result (may set GPUDIRECT_RDMA)
end
Init->>DMB: continue DMA-BUF/module checks
DMB-->>Init: continue initialization
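The probe order in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: every `sketch_` name, the flag values, and the idea of detecting a driver by checking that a path exists are assumptions made for the example. The actual driver path probed for ZE is not shown on this page, so the paths are left as parameters.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical capability flags; these are NOT the real UCX definitions. */
#define SKETCH_FLAG_ROCM_KFD       (UINT64_C(1) << 0)
#define SKETCH_FLAG_GPUDIRECT_RDMA (UINT64_C(1) << 1)

/* Assumed detection method: the driver counts as present if its path exists. */
static int sketch_driver_present(const char *driver_path)
{
    return access(driver_path, F_OK) == 0;
}

/*
 * Mirrors the order in the diagram: the ROCm KFD probe runs first, then the
 * ZE probe, which is compiled in only when HAVE_ZE is defined and may set
 * the GPUDirect RDMA flag. The caller continues with the DMA-BUF and module
 * checks afterwards.
 */
static uint64_t sketch_probe_gpudirect(const char *kfd_path, const char *ze_path)
{
    uint64_t flags = 0;

    if (sketch_driver_present(kfd_path)) {
        flags |= SKETCH_FLAG_ROCM_KFD;
    }

#ifdef HAVE_ZE
    if (sketch_driver_present(ze_path)) {
        flags |= SKETCH_FLAG_GPUDIRECT_RDMA;
    }
#else
    (void)ze_path; /* ZE support not compiled in; the probe is skipped */
#endif

    return flags;
}
```

Keeping the ZE probe behind the compile-time guard means builds without Level Zero support behave exactly as before.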
sequenceDiagram
participant Query as uct_ze_copy_md_query()
participant Attr as md_attr
Note over Query: build access_mem_types mask
Query->>Attr: set = ZE_HOST | ZE_DEVICE | ZE_MANAGED | HOST
Attr-->>Query: return md attributes
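The mask built in the diagram above could look something like the sketch below. The enum, macro, and struct are local stand-ins invented for the example (UCX has its own memory-type enum and attribute struct); only the set of four memory types comes from the diagram. The commit title "UCT/ZE: Add host memory type" suggests that plain HOST is the newly added entry, but that is an inference.

```c
#include <stdint.h>

/* Hypothetical stand-in for the memory-type enum; values are illustrative. */
typedef enum {
    SKETCH_MEM_HOST,
    SKETCH_MEM_ZE_HOST,
    SKETCH_MEM_ZE_DEVICE,
    SKETCH_MEM_ZE_MANAGED,
    SKETCH_MEM_LAST
} sketch_mem_t;

#define SKETCH_MEM_BIT(type) (UINT64_C(1) << (type))

/* Hypothetical stand-in for the memory domain attributes. */
typedef struct {
    uint64_t access_mem_types; /* bitmask of memory types the MD can access */
} sketch_md_attr_t;

/* Builds the access mask from the diagram: ZE_HOST | ZE_DEVICE | ZE_MANAGED | HOST. */
static void sketch_ze_copy_md_query(sketch_md_attr_t *attr)
{
    attr->access_mem_types = SKETCH_MEM_BIT(SKETCH_MEM_ZE_HOST) |
                             SKETCH_MEM_BIT(SKETCH_MEM_ZE_DEVICE) |
                             SKETCH_MEM_BIT(SKETCH_MEM_ZE_MANAGED) |
                             SKETCH_MEM_BIT(SKETCH_MEM_HOST);
}

/* Returns nonzero if the given memory type is present in the mask. */
static int sketch_can_access(const sketch_md_attr_t *attr, sketch_mem_t type)
{
    return (attr->access_mem_types & SKETCH_MEM_BIT(type)) != 0;
}
```

A caller would fill the attributes once with `sketch_ze_copy_md_query()` and then test individual types with `sketch_can_access()`.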
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
3b374b7 to b145428
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/uct/ib/base/ib_md.c (1 hunks)
- src/uct/ze/copy/ze_copy_md.c (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/uct/ze/copy/ze_copy_md.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: UCX PR (Codestyle commit title)
- GitHub Check: UCX PR (Codestyle ctags check)
- GitHub Check: UCX PR (Codestyle codespell check)
- GitHub Check: UCX PR (Codestyle AUTHORS file update check)
- GitHub Check: UCX PR (Codestyle format code)
- GitHub Check: UCX release DRP (Prepare CheckRelease)
- GitHub Check: UCX release (Prepare CheckRelease)
- GitHub Check: UCX snapshot (Prepare Check)
* UCT/ZE: Add host memory type
* UCT/IB: Add GDR support for intel XPU
b145428 to 98fce7d
* UCT/ZE: Add host memory type
* UCT/IB: Add GDR support for intel XPU
* AUTHORS: Add author
613fbb8 to ca2a77e
* UCT/ZE: Add host memory type
* UCT/IB: Add GDR support for intel XPU
* AUTHORS: Add author
What?
UCX currently cannot use GPUDirect RDMA (GDR) with the ZE (Level Zero) backend. This PR adds GDR support for Intel XPU.
Why?
In GDR mode, performance improves because data no longer has to be staged through an extra memcpy.
How?
The implementation builds on the ZE backend's dmabuf feature and the IB transport's GDR support.